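Sparse multichannel blind deconvolution (S-MBD) arises frequently in engineering applications such as radar, sonar, and ultrasound imaging. To reduce its computational and implementation cost, we propose a compression method that enables blind recovery from far fewer measurements in time than the full received signal. The proposed compression measures the signal through a filter followed by subsampling, which significantly reduces the implementation cost. We derive theoretical guarantees for the identifiability and recovery of a sparse filter from compressed measurements. Our results allow for the design of a wide class of compression filters. We then propose a data-driven unrolled learning framework to learn the compression filter and solve the S-MBD problem. The encoder is a recurrent inference network that maps compressed measurements to an estimate of the sparse filters. We demonstrate that, compared with optimization-based methods, our unrolled learning method is more robust to the choice of source shape and achieves better recovery performance. Finally, in applications with limited data (few-shot learning), we highlight the superior generalization capability of unrolled learning compared with conventional deep learning.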
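As a rough illustration of the "filter, then subsample" measurement described above, the following minimal sketch builds one toy channel and compresses it. The names `convolve` and `compress`, the two-tap averaging filter, and the stride of 3 are all assumptions chosen for the example; in the abstract the compression filter is learned rather than fixed.

```python
def convolve(signal, kernel):
    """Full linear convolution of two 1-D sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def compress(received, compression_filter, stride):
    """Filter the received signal, then keep every `stride`-th sample."""
    filtered = convolve(received, compression_filter)
    return filtered[::stride]


# Toy channel: a sparse filter convolved with a short source shape.
# Both sequences are made up for illustration.
sparse_filter = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, -0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
source = [1.0, 0.5, 0.25]
received = convolve(sparse_filter, source)

# Hypothetical fixed compression filter (two-tap average) and stride.
measurements = compress(received, [0.5, 0.5], stride=3)
print(len(received), len(measurements))
```

The recovery step (estimating the sparse filter and the source from `measurements`) is not shown here; it is the part the abstract addresses with an unrolled network.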
translated by 谷歌翻译
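Dictionary learning, the problem of representing data as a combination of a few atoms, has long stood as a popular method for learning representations in statistics and signal processing. The most popular dictionary learning algorithm alternates between sparse coding and dictionary update steps, and a rich literature has studied its theoretical convergence. The growing popularity of neurally plausible unfolded sparse coding networks has led to the empirical finding that backpropagation through such networks performs dictionary learning. This paper offers the first theoretical proof of these empirical results through PUDLE, a provable unfolded dictionary learning method. We highlight the impact of the loss, unfolding, and backpropagation on convergence. We discover an implicit acceleration: as a function of unfolding, the backpropagated gradient converges faster and is more accurate than the gradient from alternating minimization. We complement our findings with synthetic and image denoising experiments. The findings support the use of accelerated deep learning optimizers and unfolded networks for dictionary learning.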
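Convolutional dictionary learning (CDL), the problem of estimating shift-invariant templates from data, is typically conducted in the absence of a prior or structure on the templates. In data-scarce or low signal-to-noise ratio (SNR) regimes, the learned templates overfit the data and lack smoothness, which can affect the predictive performance of downstream tasks. To address this limitation, we propose GPCDL, a convolutional dictionary learning framework that enforces priors on the templates using Gaussian processes (GPs). Focusing on smoothness, we show theoretically that imposing a GP prior is equivalent to Wiener filtering the learned templates, thereby suppressing high-frequency components and promoting smoothness. We show that the algorithm is a simple extension of the classical iteratively reweighted least squares algorithm, independent of the choice of GP kernel. This property allows one to experiment flexibly with different smoothness assumptions. Through simulation, we show that GPCDL learns smooth dictionaries with better accuracy than the unregularized alternative across a range of SNRs. Through an application to neural spiking data, we show that GPCDL learns a more accurate and visually interpretable smooth dictionary, leading to superior predictive performance compared with non-regularized CDL as well as parametric alternatives.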
Discriminative features extracted from the sparse coding model have been shown to perform well for classification. Recent deep learning architectures have further improved reconstruction in inverse problems by considering new dense priors learned from data. We propose a novel dense and sparse coding model that integrates both representation capability and discriminative features. The model studies the problem of recovering a dense vector $\mathbf{x}$ and a sparse vector $\mathbf{u}$ given measurements of the form $\mathbf{y} = \mathbf{A}\mathbf{x}+\mathbf{B}\mathbf{u}$. Our first analysis proposes a geometric condition, based on the minimal angle between the spanning subspaces corresponding to the matrices $\mathbf{A}$ and $\mathbf{B}$, that guarantees a unique solution to the model. The second analysis shows that, under mild assumptions, a convex program recovers the dense and sparse components. We validate the effectiveness of the model on simulated data and propose a dense and sparse autoencoder (DenSaE) tailored to learning the dictionaries from the dense and sparse model. We demonstrate that (i) DenSaE denoises natural images better than architectures derived from the sparse coding model ($\mathbf{B}\mathbf{u}$), (ii) in the presence of noise, training the biases in the latter amounts to implicitly learning the $\mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}$ model, (iii) $\mathbf{A}$ and $\mathbf{B}$ capture low- and high-frequency content, respectively, and (iv) compared to the sparse coding model, DenSaE offers a balance between discriminative power and representation.
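To give some intuition for why an angle condition between the spans of $\mathbf{A}$ and $\mathbf{B}$ matters for uniqueness, here is a deliberately tiny sketch in which both matrices are single columns in the plane, so $\mathbf{x}$ and $\mathbf{u}$ are scalars. This is an assumption-laden toy case, not the paper's general condition or its convex program; the function names are hypothetical.

```python
import math


def angle_between(a, b):
    """Angle in [0, pi/2] between the lines spanned by 2-D vectors a and b."""
    cos = abs(a[0] * b[0] + a[1] * b[1]) / (math.hypot(*a) * math.hypot(*b))
    return math.acos(min(1.0, cos))


def solve_dense_sparse(a, b, y):
    """Recover scalars (x, u) from y = a*x + b*u using Cramer's rule."""
    # The determinant equals |a||b|sin(angle), so it vanishes exactly
    # when the two one-dimensional subspaces coincide.
    det = a[0] * b[1] - a[1] * b[0]
    if abs(det) < 1e-12:
        raise ValueError("span(a) and span(b) coincide: (x, u) is not unique")
    x = (y[0] * b[1] - y[1] * b[0]) / det
    u = (a[0] * y[1] - a[1] * y[0]) / det
    return x, u


a, b = (1.0, 0.0), (1.0, 1.0)
y = (2.0 * a[0] + 3.0 * b[0], 2.0 * a[1] + 3.0 * b[1])  # x = 2, u = 3
x_hat, u_hat = solve_dense_sparse(a, b, y)
theta = angle_between(a, b)
print(x_hat, u_hat, theta)
```

When the angle shrinks to zero the determinant vanishes and the decomposition of $\mathbf{y}$ into the two components is no longer unique, which is the degenerate situation a minimal-angle condition rules out.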
Variational autoencoders (VAEs) are powerful tools for learning latent representations of data used in a wide range of applications. In practice, VAEs usually require multiple training rounds to choose the amount of information the latent variable should retain. This trade-off between the reconstruction error (distortion) and the KL divergence (rate) is typically parameterized by a hyperparameter $\beta$. In this paper, we introduce Multi-Rate VAE (MR-VAE), a computationally efficient framework for learning optimal parameters corresponding to various $\beta$ in a single training run. The key idea is to explicitly formulate a response function that maps $\beta$ to the optimal parameters using hypernetworks. MR-VAEs construct a compact response hypernetwork where the pre-activations are conditionally gated based on $\beta$. We justify the proposed architecture by analyzing linear VAEs and showing that it can represent response functions exactly for linear VAEs. With the learned hypernetwork, MR-VAEs can construct the rate-distortion curve without additional training and can be deployed with significantly less hyperparameter tuning. Empirically, our approach is competitive and often exceeds the performance of multiple $\beta$-VAEs training with minimal computation and memory overheads.
Faced with the threat of identity leakage during voice data publishing, users face a privacy-utility dilemma when enjoying convenient voice services. Existing studies employ direct modification or text-based re-synthesis to de-identify users' voices, but these approaches result in inconsistent audibility in the presence of human participants. In this paper, we propose a voice de-identification system that uses adversarial examples to balance the privacy and utility of voice services. Instead of typical additive examples, which induce perceivable distortions, we design a novel convolutional adversarial example that modulates perturbations into real-world room impulse responses. As a result, our system can protect user identity from exposure by Automatic Speaker Identification (ASI) while retaining the perceptual quality of the voice, enabling non-intrusive de-identification. Moreover, our system learns a compact speaker distribution through a conditional variational auto-encoder to sample diverse target embeddings on demand. By combining diverse target generation with input-specific perturbation construction, our system enables any-to-any identity transformation for adaptive de-identification. Experimental results show that our system achieves 98% and 79% successful de-identification on mainstream ASIs and commercial systems, respectively, with an objective Mel cepstral distortion of 4.31dB and a subjective mean opinion score of 4.48.
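Training deep neural networks in low rank, i.e., with factorised layers, is of particular interest to the community: it offers efficiency over unfactorised training in terms of both memory consumption and training time. Prior work has focused on low-rank approximations of pre-trained networks and on training in low-rank space with additional objectives, offering various ad hoc explanations for the chosen practice. We analyse techniques that work well in practice, and through extensive ablations on models such as GPT2, we provide evidence falsifying common beliefs in the field, hinting in the process at exciting research opportunities that still need answering.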
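The memory argument for factorised layers can be made concrete with a small sketch. It assumes the simplest form of factorisation, replacing a weight matrix $W$ of shape $d_{out} \times d_{in}$ by a product $UV$ with inner rank $r$; the layer width of 768 and the rank of 64 are illustrative choices, not values taken from the paper.

```python
def dense_params(d_in, d_out):
    """Parameter count of an unfactorised weight matrix."""
    return d_in * d_out


def factorised_params(d_in, d_out, rank):
    """Parameter count of U (d_out x rank) plus V (rank x d_in)."""
    return rank * (d_in + d_out)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def factorised_forward(U, V, x):
    """Apply W = U V without ever forming W: compute U (V x)."""
    return matvec(U, matvec(V, x))


# Illustrative sizes only.
full = dense_params(768, 768)
low_rank = factorised_params(768, 768, rank=64)
print(full, low_rank, low_rank / full)

# Tiny rank-1 layer mapping 3 inputs to 2 outputs.
U = [[1.0], [2.0]]
V = [[1.0, 0.0, 1.0]]
y = factorised_forward(U, V, [1.0, 2.0, 3.0])
print(y)
```

The factorised layer is smaller whenever $r(d_{in} + d_{out}) < d_{in} d_{out}$, and applying $V$ before $U$ keeps the intermediate activation at size $r$.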
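Reconstructing an accurate and consistent large-scale LiDAR point cloud map is crucial for robotics applications. The existing solution, pose graph optimization, is time-efficient but does not directly optimize mapping consistency. LiDAR bundle adjustment (BA) has recently been proposed to resolve this issue; however, it is too time-consuming on large-scale maps. To mitigate this problem, this paper presents a globally consistent and efficient mapping method suitable for large-scale maps. Our proposed work consists of a bottom-up hierarchical BA and a top-down pose graph optimization, combining the advantages of both methods. With the hierarchical design, we solve multiple BA problems whose Hessian matrices are much smaller than that of the original BA; with the pose graph optimization, we smoothly and efficiently update the LiDAR poses. The effectiveness and robustness of our proposed approach have been validated on multiple public spinning LiDAR datasets that are large in both spatial and temporal scale, i.e., KITTI, MulRan, and Newer College, and on self-collected solid-state LiDAR datasets in structured and unstructured scenes. With proper setups, we demonstrate that our work can generate a globally consistent map in around 12% of the sequence time.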
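Bundle adjustment (BA) refers to the problem of simultaneously determining sensor poses and scene geometry, which is a fundamental problem in robot vision. This paper presents an efficient and consistent bundle adjustment method for LiDAR sensors. The method employs edge and plane features to represent the scene geometry and directly minimizes the natural Euclidean distance from each raw point to the respective geometric feature. A nice property of this formulation is that the geometric features can be solved analytically, drastically reducing the dimension of the numerical optimization. To represent and solve the resulting optimization problem more efficiently, this paper proposes a novel concept, {\it point clusters}, which encodes all raw points associated with the same feature by a compact set of parameters, the {\it point cluster coordinates}. We derive closed-form derivatives of the BA optimization based on the point cluster coordinates and show their theoretical properties, such as the null spaces and sparsity. Based on these theoretical results, this paper develops an efficient second-order BA solver. Besides estimating the LiDAR poses, the solver also exploits the second-order information to estimate the pose uncertainty caused by measurement noise, leading to consistent estimates of the LiDAR poses. Moreover, thanks to the use of point clusters, the developed solver fundamentally avoids enumerating each raw point (which is very time-consuming due to their large number) in all steps of the optimization: cost evaluation, derivative evaluation, and uncertainty evaluation. The implementation of our method is open-sourced to benefit the robotics community and beyond.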
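The idea of summarising all points of a feature by a few aggregate quantities can be sketched as follows. The sketch assumes a cluster stores the point count, the sum of the points, and the sum of their outer products; the abstract does not spell out the exact parameterisation, so the layout and the function names here are hypothetical. With these three quantities, the sum of squared point-to-plane distances can be evaluated without revisiting the raw points.

```python
def make_cluster(points):
    """Summarise 3-D points by (sum of outer products, sum of points, count)."""
    n = len(points)
    v = [sum(p[i] for p in points) for i in range(3)]
    P = [[sum(p[i] * p[j] for p in points) for j in range(3)] for i in range(3)]
    return P, v, n


def merge(c1, c2):
    """Clusters are additive: merging needs no access to raw points."""
    P = [[c1[0][i][j] + c2[0][i][j] for j in range(3)] for i in range(3)]
    v = [c1[1][i] + c2[1][i] for i in range(3)]
    return P, v, c1[2] + c2[2]


def plane_cost(cluster, normal, q):
    """Sum over points of (normal . (p - q))^2, from cluster terms only.

    Expands to normal^T P normal - 2 (normal . v)(normal . q) + n (normal . q)^2.
    `normal` is assumed to be a unit vector and `q` a point on the plane.
    """
    P, v, n = cluster
    nPn = sum(normal[i] * P[i][j] * normal[j] for i in range(3) for j in range(3))
    nv = sum(normal[i] * v[i] for i in range(3))
    nq = sum(normal[i] * q[i] for i in range(3))
    return nPn - 2.0 * nv * nq + n * nq * nq


points = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 1.0, 3.0)]
cluster = make_cluster(points)
cost = plane_cost(cluster, normal=(0.0, 0.0, 1.0), q=(0.0, 0.0, 1.0))

# Reference value computed the slow way, by enumerating every raw point.
direct = sum((p[2] - 1.0) ** 2 for p in points)
print(cost, direct)
```

Because the summary has a fixed size regardless of how many points it covers, the per-feature cost evaluation no longer scales with the number of raw points.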
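LiDAR odometry is one of the essential parts of LiDAR simultaneous localization and mapping (SLAM). However, existing LiDAR odometry tends to match each new scan against previous fixed-pose scans, gradually accumulating errors. Furthermore, although bundle adjustment (BA) is an effective joint optimization mechanism, it cannot be introduced directly into real-time odometry because of the intensive computation required for large-scale global landmarks. Therefore, this letter designs a new strategy, named a landmark map for bundle adjustment odometry (LMBAO) in LiDAR SLAM, to solve these problems. First, BA-based odometry is further developed with an active landmark maintenance strategy for more accurate local registration and to avoid cumulative errors. Specifically, this paper keeps entire stable landmarks on the map, rather than only their feature points in the sliding window, and deletes landmarks according to their active grade. Next, the sliding window length is reduced, and marginalization is performed to retain the scans that are outside the window but correspond to active landmarks on the map, greatly simplifying the computation and improving real-time performance. In addition, experiments on three challenging datasets show that our algorithm achieves real-time performance in outdoor driving and outperforms state-of-the-art LiDAR SLAM algorithms, including Lego-LOAM and VLOM.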